
feat(cli): add hyperframes auth OAuth (PKCE + loopback + refresh) #1084

Merged
jrusso1020 merged 1 commit into main from 05-26-feat_cli_add_hyperframes_auth_oauth_pkce_loopback_refresh_
May 28, 2026

Conversation

@jrusso1020
Collaborator

@jrusso1020 jrusso1020 commented May 26, 2026

What

Adds OAuth 2.0 + PKCE login as the default for hyperframes auth login,
plus refresh-token + 401 auto-retry + auth refresh. Stacks on top of
PR #1081 (the API-key + shared store work).

  • hyperframes auth login (no flags) — opens the user's browser to
    /v1/oauth/authorize, captures the code on an ephemeral
    127.0.0.1:<port>/oauth/callback, exchanges it for tokens with
    PKCE S256, and persists. --api-key opts back into the legacy
    long-lived-key path from PR #1081 (feat(cli): add hyperframes auth login --api-key, status, logout).
  • hyperframes auth refresh — force-refresh the OAuth access token
    using the stored refresh_token. Mostly useful for testing the path.
  • hyperframes auth logout — best-effort revokes via
    POST /v1/oauth/revoke (RFC 7009) before wiping local state.
  • AuthClient now refreshes-and-retries once on a 401 when the
    caller wires onUnauthenticatedRefresh. auth status wires it.

Internals added in packages/cli/src/auth/:

  • pkce.ts — RFC 7636 code_verifier + S256 code_challenge.
  • loopback.ts — ephemeral 127.0.0.1 HTTP server; state validation,
    120s timeout, styled success/error page.
  • browser.ts — wraps open with a BROWSER=none /
    HF_NO_BROWSER=1 fallback that prints the URL.
  • oauth.ts — startAuthorizationCodeFlow, refreshTokens,
    revokeTokens, requireOAuthConfigured, parseTokenResponse.
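
For readers unfamiliar with RFC 7636, here is a minimal sketch of what a module like pkce.ts has to produce. The function and type names (`generatePkcePair`, `challengeFor`, `PkcePair`) are illustrative and not taken from the PR. The 64-byte verifier size is the one the reviewers report later in this thread.

```typescript
import { createHash, randomBytes } from "node:crypto";

/** A PKCE pair: the secret verifier and its public S256 challenge. */
interface PkcePair {
  codeVerifier: string;
  codeChallenge: string;
  codeChallengeMethod: "S256";
}

/** S256 challenge per RFC 7636 section 4.2: base64url(SHA-256(ASCII(verifier))). */
function challengeFor(codeVerifier: string): string {
  return createHash("sha256").update(codeVerifier, "ascii").digest("base64url");
}

/** Fresh verifier (64 random bytes, 86 base64url characters) plus its challenge. */
function generatePkcePair(): PkcePair {
  const codeVerifier = randomBytes(64).toString("base64url");
  return {
    codeVerifier,
    codeChallenge: challengeFor(codeVerifier),
    codeChallengeMethod: "S256",
  };
}
```

The verifier stays in memory and is sent only on the token exchange; the challenge is the only value that travels on the authorize URL.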

Why

This is the foundation OAuth flow that lets free-tier users authenticate
without managing a long-lived key. Refresh + auto-retry means CLI
commands keep working past the access_token lifetime without bugging
the user.

The OAuth client_id (q2A2QRSke2LrFTPJhoDbHtXh) is the one James
created in the oauth2_client table. Baked in as a build-time default;
override via HYPERFRAMES_OAUTH_CLIENT_ID for dev/test.

How

  • Public client: PKCE only, no client_secret. Backend already
    requires PKCE (movio/logic/oauth2.py:638).
  • Loopback port is ephemeral (server.listen(0)) — the backend
    wildcards localhost ports for public clients
    (movio/model/oauth2.py:check_redirect_uri), so the registered
    redirect URI's port is a placeholder.
  • State parameter is generated per-flow + validated on callback to
    prevent CSRF.
  • Token-response parsing is permissive on expires_in type (some
    servers return it as a string) but strict on access_token presence.
  • 401 retry happens at the AuthClient.fetchUser layer, not the
    command layer — so future endpoints inherit it for free.
  • persistOAuth merges into the existing store (preserves co-located
    api_key). auth login (API-key path) does the symmetric thing.
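
The request shape described above can be sketched as follows. This is an illustration under stated assumptions, not the PR's code: `buildAuthorizeUrl`, `loopbackRedirectUri`, and `loggableUrl` are hypothetical names, and the query parameters are the standard RFC 6749 / RFC 7636 ones. The /v1/oauth/authorize and /oauth/callback paths come from the description.

```typescript
/** Inputs for the authorize request; every value is supplied by the caller. */
interface AuthorizeParams {
  apiBaseUrl: string; // for example the value of HEYGEN_API_URL
  clientId: string;
  redirectUri: string; // built once from the loopback port and reused verbatim
  state: string; // per-flow random value, validated on callback
  codeChallenge: string; // S256 challenge derived from the in-memory verifier
}

/** Redirect URI for an ephemeral loopback port on 127.0.0.1. */
function loopbackRedirectUri(port: number): string {
  return `http://127.0.0.1:${port}/oauth/callback`;
}

/** Authorization-code + PKCE request for a public client (no client_secret). */
function buildAuthorizeUrl(p: AuthorizeParams): URL {
  const url = new URL("/v1/oauth/authorize", p.apiBaseUrl);
  url.searchParams.set("response_type", "code");
  url.searchParams.set("client_id", p.clientId);
  url.searchParams.set("redirect_uri", p.redirectUri);
  url.searchParams.set("state", p.state);
  url.searchParams.set("code_challenge", p.codeChallenge);
  url.searchParams.set("code_challenge_method", "S256");
  return url;
}

/** Origin + pathname only, so the state value never reaches logs. */
function loggableUrl(url: URL): string {
  return url.origin + url.pathname;
}
```

The same `redirectUri` string would be sent again on the token exchange, since the server compares the two values byte for byte.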

Test plan

  • 80 unit tests, all green. vitest run src/auth/.
  • PKCE: verifier within 43-128 chars, challenge = SHA-256, S256
    method, distinct outputs each call.
  • Loopback: state mismatch / IdP error / missing-code / timeout /
    404 non-callback paths all rejected; success path captures code.
  • OAuth: refreshTokens posts correct body, persists, throws
    REFRESH_FAILED on 400/401 and API_ERROR on 5xx. Existing
    api_key preserved on refresh.
  • AuthClient: 401 retries with refreshed bearer on OAuth, does
    NOT retry for api_key, returns 401 if refresh hook fails.
  • bunx oxlint / bunx oxfmt --check / bunx tsc clean.
  • bunx fallow audit --base origin/main --fail-on-issues — only
    inherited help.ts:showUsage finding (from main, not this PR).
  • Smoke test against dev API:
    HEYGEN_API_URL=https://api.dev.heygen.com hyperframes auth login
    then hyperframes auth status then hyperframes auth refresh.

Out of scope

  • Cloud render commands — separate plan.
  • PR 4 (heygen-cli read-side JSON support) — independent, ships after.

Collaborator Author

jrusso1020 commented May 26, 2026

@jrusso1020 jrusso1020 force-pushed the 05-26-feat_cli_add_hyperframes_auth_oauth_pkce_loopback_refresh_ branch from 2605d59 to 733ea2e Compare May 26, 2026 18:16
@jrusso1020
Collaborator Author

Self-review pass on the OAuth work — addressed 12 of 15 findings. Force-push above includes the fixes.

Fixed (high severity):

  1. persistOAuth wiped refresh_token on no-rotation refresh — RFC 6749 §6 lets the server omit refresh_token in a refresh response, and we were silently dropping the previous one. Now merges the new tokens into the existing oauth block. New test covers this.
  2. redirectUri was rebuilt from req.socket.localAddress — could drift on dual-stack hosts and fail the byte-identical match at /v1/oauth/token. Now the loopback passes the original redirectUri (captured once from server.address().port) through to the token-exchange callsite.
  3. refresh.ts double-persisted with a stale snapshot — dropped the second writeStore; refreshTokens already persists via persistOAuth with merge semantics.
  4. revokeTokens had no timeout — added AbortController with 5s default. A flaky network can no longer hang logout for the OS-level TCP timeout.
  5. revokeTokens ignored response status and only revoked one token — now logs HTTP ≥400 to stderr; logout revokes both refresh_token and access_token with the proper RFC 7009 token_type_hint.

Fixed (medium severity):

  6. expires_in ≤ 0 → immediate-refresh loop — clamped to a 30s minimum.
  7. parseTokenResponse threw ErrApi(200, …) for shape failures (self-contradictory message, wrong routing through tryRefresh). Now throws ErrRefreshFailed so callers consistently route to "log in again".
  8. OAuth access_token / refresh_token not isHeaderSafe-validated — added the same isHeaderSafe check used by the store's JSON read path, so an IdP can't smuggle CRLF into headers via the token-exchange response.
  9. runOAuthLogin no rollback on verify failure — wired the same refresh hook as auth status so a transient 401 right after sign-in transparently refreshes-and-retries; verify failure now downgraded to a warning (tokens are valid on disk, force-rolling-back would be worse).
  10. revokeTokens called resolveClientId() outside the try — now returns silently when OAuth isn't configured. Best-effort contract holds for any caller.
  11. State leaked in console.log of authorize URL — now logs origin + pathname only; the random state stays out of scrollback / CI logs.
  12. state !== expectedState — replaced with crypto.timingSafeEqual. Low real risk on loopback but a gratuitous deviation. Also added GET-only method check on the loopback handler.
  13. requireOAuthConfigured() exited inline — renamed and split: assertOAuthConfiguredOrExit() (command-entry helper that exits) vs resolveClientId() (throws). Pattern matches the rest of the file.
  14. Dynamic-import of same module already statically imported — removed.
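
Items 6 to 8 all concern token-response parsing. A rough sketch of a parser with those properties is below; it is not the PR's parseTokenResponse. `headerSafe` is a stand-in for the project's isHeaderSafe (whose real rule is not shown in this thread), the 3600-second default is an assumption, and plain `Error` replaces the project's ErrRefreshFailed.

```typescript
interface OAuthTokens {
  access_token: string;
  refresh_token?: string;
  expires_in: number; // seconds, after clamping
}

const MIN_EXPIRES_IN_SECONDS = 30; // clamp value quoted in item 6
const DEFAULT_EXPIRES_IN_SECONDS = 3600; // assumption: used when expires_in is absent

/** Stand-in header-safety rule: visible ASCII only, so no CR, LF, or spaces. */
function headerSafe(value: string): boolean {
  return /^[\x21-\x7e]+$/.test(value);
}

/** Strict on token presence and safety, permissive on the type of expires_in. */
function parseTokenResponse(body: unknown): OAuthTokens {
  if (typeof body !== "object" || body === null) {
    throw new Error("token response is not a JSON object");
  }
  const raw = body as Record<string, unknown>;

  const accessToken = raw["access_token"];
  if (typeof accessToken !== "string" || !headerSafe(accessToken)) {
    throw new Error("token response has a missing or unsafe access_token");
  }

  // Some servers send expires_in as a string; accept both, then clamp.
  const rawExpiry = raw["expires_in"];
  const parsed =
    typeof rawExpiry === "number" || typeof rawExpiry === "string" ? Number(rawExpiry) : NaN;
  const expiresIn = Number.isFinite(parsed)
    ? Math.max(parsed, MIN_EXPIRES_IN_SECONDS)
    : DEFAULT_EXPIRES_IN_SECONDS;

  const tokens: OAuthTokens = { access_token: accessToken, expires_in: expiresIn };

  const refreshToken = raw["refresh_token"];
  if (refreshToken !== undefined) {
    if (typeof refreshToken !== "string" || !headerSafe(refreshToken)) {
      throw new Error("token response has an unsafe refresh_token");
    }
    tokens.refresh_token = refreshToken;
  }
  return tokens;
}
```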

Deferred (won't fix in this PR):

  • HEYGEN_API_URL accepts arbitrary hosts — flagged in the prior review too. By design today for dev testing (api.dev.heygen.com), and removing it would block that workflow. Reasonable follow-up: warn on stderr when the URL is non-default, and possibly require --allow-insecure-base-url for non-HTTPS schemes. Tracked separately.
  • process.exit after console.error can drop stderr when piped — would need to await stderr drain across every call site. Real but small-impact; deferring to a focused follow-up.

Tests: 89 unit tests, all green. Lint/format/typecheck/fallow clean.

@jrusso1020 jrusso1020 force-pushed the 05-26-feat_cli_add_hyperframes_auth_login_--api-key_status_logout branch from a72b6ac to 5c81d96 Compare May 26, 2026 20:30
@jrusso1020 jrusso1020 force-pushed the 05-26-feat_cli_add_hyperframes_auth_oauth_pkce_loopback_refresh_ branch from 733ea2e to fa489bf Compare May 26, 2026 20:30
@jrusso1020 jrusso1020 force-pushed the 05-26-feat_cli_add_hyperframes_auth_login_--api-key_status_logout branch from 5c81d96 to da2c2ea Compare May 26, 2026 23:51
@jrusso1020 jrusso1020 force-pushed the 05-26-feat_cli_add_hyperframes_auth_oauth_pkce_loopback_refresh_ branch from fa489bf to 057c373 Compare May 26, 2026 23:51
Collaborator

@miguel-heygen miguel-heygen left a comment

PKCE implementation looks correct — S256, proper verifier length, timing-safe state comparison, loopback bound to 127.0.0.1 only. The refresh/revocation flows are well thought out, especially preserving the RT on no-rotation refresh. Good stuff.

Two things worth thinking about:

Expired auth code error messaging — In exchangeCodeForTokens, a non-2xx response throws ErrApi which surfaces as a generic "HeyGen API error (400)". If the authorization code expires during the 120s loopback window (or was already used), the user gets a cryptic message after waiting a while. refreshTokens already special-cases 400/401 into REFRESH_FAILED — doing something similar for code exchange would make the failure a lot less confusing.

Concurrent refresh race — Two simultaneous CLI invocations hitting 401 will both try to refresh. The second one will likely get invalid_grant if the server invalidates the old RT on first use. Totally acceptable for a CLI tool, but if token rotation is ever enabled server-side, the second process silently loses its session. A file advisory lock would prevent it, but that's probably overkill for now — just worth being aware of.

Minor observation: assertOAuthConfiguredOrExit can never actually fire in production since DEFAULT_CLIENT_ID is always non-empty. The guard only protects against someone explicitly setting HYPERFRAMES_OAUTH_CLIENT_ID="". Not a problem, just means the OAUTH_NOT_CONFIGURED error path is effectively dead code.

Looks good to me.

Collaborator

@vanceingalls vanceingalls left a comment

Suggested status: Request changes.

Findings:

  • Blocking/security: OAuth token-endpoint error bodies are surfaced through safeText() without the credential scrubbing used by AuthClient (packages/cli/src/auth/oauth.ts:161, packages/cli/src/auth/oauth.ts:272, packages/cli/src/auth/oauth.ts:372). These requests carry refresh_token, authorization code, and code_verifier in the form body; server/proxy error pages can echo request data. Please share the scrubber or add equivalent redaction here, with a test proving refresh/login errors cannot print token-shaped secrets.
  • Blocking/correctness: persistOAuth() preserves prior refresh_token/scope/token_type whenever the current response omits them (packages/cli/src/auth/oauth.ts:340), but it is also used by fresh authorization-code login (packages/cli/src/auth/oauth.ts:136). A new interactive login that omits refresh_token can pair a new access token with the previous session refresh token, then silently switch back or fail on the next refresh. Preserve missing refresh tokens only for refresh-grant persistence; fresh login should overwrite the OAuth block apart from preserving the co-located API key.
  • Downstack: this PR inherits the API-key rollback issue from #1081, so it should stay blocked until that is fixed too.

@jrusso1020 jrusso1020 force-pushed the 05-26-feat_cli_add_hyperframes_auth_login_--api-key_status_logout branch from da2c2ea to 106b12f Compare May 27, 2026 00:08
@jrusso1020 jrusso1020 force-pushed the 05-26-feat_cli_add_hyperframes_auth_oauth_pkce_loopback_refresh_ branch from 057c373 to fca9875 Compare May 27, 2026 00:13
@jrusso1020
Collaborator Author

Thanks both — addressed the feedback. Force-pushed (still signed/verified).

Blocking (vanceingalls):

  • Token-endpoint error bodies not scrubbed — extracted the scrubber into a shared auth/scrub.ts, broadened it to also redact OAuth secrets (refresh_token, code, code_verifier, access_token, id_token, client_secret, token) in both form-encoded and JSON shapes, and wired it into oauth.ts's safeText. Now refresh / code-exchange / revoke error bodies can't print token-shaped secrets. Added scrub.test.ts (6 cases) plus an oauth.test.ts case proving a refresh error body echoing refresh_token=...&code_verifier=... comes out redacted.
  • persistOAuth inheriting refresh_token on fresh login — parametrized with preserveMissing. Fresh authorization-code login (preserveMissing: false) now overwrites the OAuth block entirely (keeping only a co-located api_key), so a response that omits refresh_token can't pair a new access token with the previous session's RT. Refresh grant (preserveMissing: true) keeps the prior RT on a no-rotation refresh. Added two flow tests: fresh login drops the old RT; fresh login preserves a co-located api_key.
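
The two persistence modes can be sketched as a pure merge over an in-memory store. This is an assumption-laden illustration: `mergeOAuth`, `Store`, and `OAuthBlock` are hypothetical names, and the real persistOAuth also writes to disk, which is omitted here.

```typescript
interface OAuthBlock {
  access_token: string;
  refresh_token?: string;
  scope?: string;
  token_type?: string;
}

interface Store {
  api_key?: string;
  oauth?: OAuthBlock;
}

/**
 * preserveMissing: true  -> refresh grant; fields the server omitted keep their old values.
 * preserveMissing: false -> fresh login; the OAuth block is replaced outright.
 * A co-located api_key survives in both modes.
 */
function mergeOAuth(store: Store, incoming: OAuthBlock, opts: { preserveMissing: boolean }): Store {
  const previous: Partial<OAuthBlock> = opts.preserveMissing ? { ...store.oauth } : {};
  const oauth: OAuthBlock = { ...previous, access_token: incoming.access_token };
  for (const key of ["refresh_token", "scope", "token_type"] as const) {
    const value = incoming[key];
    if (value !== undefined) oauth[key] = value; // only overwrite with values actually sent
  }
  return { ...store, oauth };
}
```

With a store holding `refresh_token: "rt_old"`, a refresh response that omits refresh_token keeps `rt_old`, while a fresh login that omits it leaves the new block without any refresh token.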

Non-blocking (miguel):

  • Expired/used auth-code error — exchangeCodeForTokens now special-cases 400/401 into an actionable ErrRefreshFailed("authorization code rejected …; please run `auth login` again") instead of a bare HeyGen API error (400), mirroring refreshTokens.
  • Concurrent refresh race — agreed it's acceptable for a CLI; leaving the advisory-lock idea out as you suggested.
  • assertOAuthConfiguredOrExit effectively dead — correct, it only fires on an explicit HYPERFRAMES_OAUTH_CLIENT_ID="". Keeping it as a cheap guard for that override case.

Downstack #1081's rollback issue is also fixed, so this should unblock.

Collaborator

@miguel-heygen miguel-heygen left a comment

Token exchange error handling addressed — 400/401 on code exchange now surfaces an actionable message instead of a bare API error. Good.

Concurrent refresh race and assertOAuthConfiguredOrExit dead code are cosmetic at this point. Ship it.

Collaborator

@vanceingalls vanceingalls left a comment

Suggested status: Request changes.

The earlier OAuth-specific blockers are fixed: token-endpoint bodies now go through shared scrubCredentials, and fresh auth-code login uses preserveMissing: false while refresh uses preserveMissing: true.

Blocking issue:

  • packages/cli/src/auth/scrub.ts:13 - HEADER_LINE only redacts the first whitespace-delimited word after Authorization:, so Authorization: Bearer at_secret_123 is scrubbed to authorization: <redacted> at_secret_123. Because AuthClient sends OAuth credentials as Authorization: Bearer <access_token>, an upstream/proxy error echoing that header still leaks opaque access tokens. Please redact the full header value and cover this with a test that checks the token after Bearer is gone.

@jrusso1020 jrusso1020 force-pushed the 05-26-feat_cli_add_hyperframes_auth_login_--api-key_status_logout branch from 106b12f to bb59411 Compare May 27, 2026 00:34
@jrusso1020 jrusso1020 force-pushed the 05-26-feat_cli_add_hyperframes_auth_oauth_pkce_loopback_refresh_ branch from fca9875 to e0bc7cd Compare May 27, 2026 00:37
@jrusso1020
Collaborator Author

Fixed the header-redaction leak in the shared scrubber — thanks. Force-pushed (still signed/verified).

scrub.ts HEADER_LINE — same fix as downstack: redact the full header value to end-of-line instead of one token, so Authorization: Bearer <access_token> no longer leaks the token after the scheme:

/(authorization|x-api-key)[ \t]*[:=][ \t]*[^\r\n]+/gi  →  "$1: <redacted>"

Redaction stops at the line break, so unrelated following lines survive. Added a scrub.test.ts case asserting the token after Bearer is gone (and that a following line is untouched).
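
To make the behavior concrete, here is the quoted pattern applied to a two-line error body. The regex and replacement string are copied from the comment above; the wrapper name `scrubHeaderLines` is illustrative only.

```typescript
// Pattern and replacement as quoted in the comment above.
const HEADER_LINE = /(authorization|x-api-key)[ \t]*[:=][ \t]*[^\r\n]+/gi;

/** Redact the whole value of credential-bearing header lines, up to the line break. */
function scrubHeaderLines(text: string): string {
  return text.replace(HEADER_LINE, "$1: <redacted>");
}

const echoed = "Authorization: Bearer at_secret_123\r\nX-Request-Id: abc";
const scrubbed = scrubHeaderLines(echoed);
// scrubbed is "Authorization: <redacted>\r\nX-Request-Id: abc"
```

Because `[^\r\n]+` is greedy up to the line break, both the scheme and the token are removed, and the following header line is left alone.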

Collaborator

@miguel-heygen miguel-heygen left a comment

Scrub module properly handles all credential-shaped substrings now — form-encoded fields, JSON fields, headers, JWTs. Good.

Collaborator

@vanceingalls vanceingalls left a comment

Suggested status: Approve from code review; wait for Graphite mergeability to finish before merge.

Re-review of e0bc7cd: prior OAuth blockers are fixed. Token endpoint errors use the shared scrubber, fresh authorization-code login no longer inherits a stale refresh_token, refresh still preserves a non-rotated refresh_token, and the full Authorization: Bearer <token> scrub regression is covered.

Verification run locally with Node 22 in PATH:

  • bun run --filter @hyperframes/cli test -- src/auth/client.test.ts src/auth/scrub.test.ts src/auth/oauth.test.ts src/commands/auth/login.test.ts
  • bunx oxfmt --check on touched auth/OAuth files
  • bunx oxlint on touched auth/OAuth files

@jrusso1020 jrusso1020 force-pushed the 05-26-feat_cli_add_hyperframes_auth_login_--api-key_status_logout branch from bb59411 to 9272528 Compare May 28, 2026 05:16
@jrusso1020 jrusso1020 force-pushed the 05-26-feat_cli_add_hyperframes_auth_oauth_pkce_loopback_refresh_ branch from e0bc7cd to ed429e1 Compare May 28, 2026 05:18
@jrusso1020 jrusso1020 force-pushed the 05-26-feat_cli_add_hyperframes_auth_oauth_pkce_loopback_refresh_ branch 2 times, most recently from 124bfb8 to 3383bfe Compare May 28, 2026 05:34
@jrusso1020 jrusso1020 force-pushed the 05-26-feat_cli_add_hyperframes_auth_login_--api-key_status_logout branch from 9272528 to 8a9291c Compare May 28, 2026 05:39
@jrusso1020 jrusso1020 force-pushed the 05-26-feat_cli_add_hyperframes_auth_oauth_pkce_loopback_refresh_ branch from 3383bfe to 83d84f6 Compare May 28, 2026 05:39
miguel-heygen
miguel-heygen previously approved these changes May 28, 2026
Collaborator

@miguel-heygen miguel-heygen left a comment

PKCE implementation is spec-correct:

  • Verifier: 64 random bytes → 86 base64url chars (RFC 7636 compliant)
  • Challenge: SHA-256 + base64url, method S256
  • State: 32 random bytes + timingSafeEqual comparison
  • Token storage: fresh login overwrites OAuth block, refresh preserves missing refresh_token (per RFC 6749 §6)
  • Credential scrubbing expanded to cover form-encoded, JSON-encoded, and JWT patterns
  • 401 auto-retry: fires only for OAuth with refresh_token, does not retry the retry
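
The state handling in that list can be sketched as below. The names `newState` and `stateMatches` are hypothetical, and base64url encoding of the 32 random bytes is an assumption, since the thread only gives the byte count.

```typescript
import { randomBytes, timingSafeEqual } from "node:crypto";

/** Per-flow state: 32 random bytes (256 bits), encoded for use in a URL. */
function newState(): string {
  return randomBytes(32).toString("base64url");
}

/** Constant-time comparison of the state echoed on the callback. */
function stateMatches(received: string | null, expected: string): boolean {
  if (received === null) return false;
  const a = Buffer.from(received, "utf8");
  const b = Buffer.from(expected, "utf8");
  // timingSafeEqual throws when lengths differ, so compare lengths first.
  return a.length === b.length && timingSafeEqual(a, b);
}
```

The length check leaks only the length of the state, which is fixed and public, so it does not weaken the comparison.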

80 tests covering PKCE math, loopback edge cases (state mismatch, IdP error, timeout, 404), token persistence, refresh semantics, scrubber, and 401 retry logic.

Minor nits: assertOAuthConfiguredOrExit + resolveClientId double-checks the same condition; reportIdentity re-reads credentials instead of using the just-returned tokens. Neither blocking.

Collaborator

@vanceingalls vanceingalls left a comment

Review

OAuth + PKCE foundation looks solid. Module boundaries are clean (pkce / loopback / browser / oauth / scrub / store), and the security-sensitive bits are right: crypto.randomBytes + S256 PKCE, 32-byte state with timingSafeEqual comparison, 127.0.0.1 (not localhost) + ephemeral port, 0600/0700 file modes, code_verifier kept in memory only, no secret logging on the success path, redirect_uri byte-identical across both hops (RFC 6749 §4.1.3). The 80-test suite covers the state-mismatch / IdP-error / missing-code / timeout / non-callback-path matrix cleanly, the _test-utils.ts factor-out is nice, and the refresh / preserveMissing semantics are documented well enough that the next reader won't re-derive RFC 6749 §6.

Three importants + a handful of nits below.


blocker / important

important — auth login will hang on a real browser due to HTTP keep-alive (packages/cli/src/auth/loopback.ts:171-178)

server.close() only refuses new connections; it doesn't terminate existing keep-alive sockets. Browsers issuing the callback default to Connection: keep-alive and idle the TCP connection for minutes (Chrome ~5min). I reproduced this with a stub http.Agent({ keepAlive: true }):

calling server.close()...
server.close callback fired in 0ms
still alive after 4005ms — keep-alive blocks event loop

respond() emits only content-type + cache-control, no Connection: close. End result: after the user sees "✓ Signed in as …", the CLI process hangs for the duration of the browser's keep-alive idle timeout. The unchecked smoke-test box in the PR description would catch this.

Two-line fix in respond():

res.writeHead(status, {
  "content-type": "text/html; charset=utf-8",
  "cache-control": "no-store",
  connection: "close",
})

Or call server.closeAllConnections() (Node ≥18.2) from close() after server.close(…).

important — scrubCredentials misses sk_V2_… keys (packages/cli/src/auth/scrub.ts:12)

The HEYGEN_KEY pattern is hg_[A-Za-z0-9_-]{4,} only, but this codebase already documents that real keys come in sk_V2_…, hg_…, and partner formats (store.ts:241, login.ts:18, store.test.ts:149). The HEADER_LINE regex catches keys when they appear after Authorization:/x-api-key:, but a JSON or text body that echoes a key inline ({"echoed_key":"sk_V2_…"}, a stack trace fragment, etc.) would leak the full key. That's exactly the threat model the scrubber is written for. Add sk_[A-Za-z0-9_-]{4,} (or unified (hg|sk)_…) to the pattern set and add a scrub.test.ts case.
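
A sketch of the unified pattern suggested here, using the same shape the author later reports adopting in this thread (prefix-preserving replacement). The function name `scrubKeys` and the sample key strings are made up for illustration.

```typescript
// Unified key pattern: either prefix, then at least four key characters.
const HEYGEN_KEY = /\b(hg|sk)_[A-Za-z0-9_-]{4,}/g;

/** Redact inline API keys while keeping the prefix, so logs still show the key family. */
function scrubKeys(text: string): string {
  return text.replace(HEYGEN_KEY, "$1_<redacted>");
}

const inlineEcho = scrubKeys('{"echoed_key":"sk_V2_hgu_abc123"}');
// inlineEcho is '{"echoed_key":"sk_<redacted>"}'
```

The leading `\b` keeps ordinary words such as `task_force` from matching, since there is no word boundary before the `sk` inside them.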

important — 401-retry uses a stale refresh_token for rotated-RT servers (packages/cli/src/auth/client.ts:128-133)

const refreshed = await this.tryRefresh(credential.refresh_token);
if (refreshed) {
  const next: ResolvedCredential = { ...credential, access_token: refreshed };
  return await this.fetchUser(url, next, false);
}

tryRefresh (via refreshTokens → persistOAuth) writes the new refresh_token to disk, but the in-memory credential object reused for the retry still carries the OLD refresh_token. For an IdP that rotates refresh tokens, a subsequent retry on the same credential instance would attempt to refresh with a now-invalidated RT.

Today's only call site (status) doesn't retry twice on the same credential, so this is latent. But the comment on getCurrentUser says "future endpoints inherit it for free" — those future endpoints will eat this. Either change the hook signature to return the full OAuthTokens (or a fresh ResolvedCredential), or have the hook also be responsible for re-reading from store and returning a fresh credential. A test that asserts the post-refresh credential carries the rotated RT would also catch this on the next refactor.


nit

nit — MIN_EXPIRES_IN_SECONDS=30 is less than EXPIRY_SKEW_MS=60_000 (oauth.ts:51, resolver.ts:42)

The clamp's comment says it prevents an immediate-refresh loop, but with the resolver's 60s skew, any expires_in ≤ 90s lands an expires_at the resolver immediately tags as refreshable: true. Today there's no proactive-refresh path so it doesn't actually loop (the 401-retry only fires on a real 401), so this is cosmetic — but if a future change adds proactive refresh on refreshable, you'd hit the loop the clamp is supposed to prevent. Set MIN_EXPIRES_IN_SECONDS = (EXPIRY_SKEW_MS/1000) + 30 or document the dependency.
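
The coupling between the two constants is easier to see with numbers. The sketch below assumes the resolver treats a token as refreshable once the current time is within the skew of expires_at; that rule and the helper names are assumptions, not the code in resolver.ts.

```typescript
const EXPIRY_SKEW_MS = 60_000; // value quoted in the nit

/** Assumed resolver rule: refreshable once we are within the skew of expiry. */
function isRefreshable(expiresAtMs: number, nowMs: number): boolean {
  return nowMs >= expiresAtMs - EXPIRY_SKEW_MS;
}

/** expires_at computed from a clamped expires_in (seconds). */
function expiresAt(nowMs: number, expiresInSeconds: number, minSeconds: number): number {
  return nowMs + Math.max(expiresInSeconds, minSeconds) * 1000;
}

const now = 1_000_000;

// Clamp of 30s: a token minted with expires_in = 5 is refreshable immediately.
const withClamp30 = isRefreshable(expiresAt(now, 5, 30), now);

// Clamp of skew + 30s = 90s: the same token is not refreshable for another 30s.
const withClamp90 = isRefreshable(expiresAt(now, 5, EXPIRY_SKEW_MS / 1000 + 30), now);
```

Under this assumed rule `withClamp30` is true and `withClamp90` is false, which is the gap the suggested `(EXPIRY_SKEW_MS/1000) + 30` minimum is meant to close.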

nit — persistOAuth is read-modify-write without a lock (oauth.ts:400-419)

Two concurrent hyperframes auth refresh or auth login --api-key + auth refresh processes will race: each reads the store, writes a merged result via temp+rename. Last-write-wins. The window is short and writeStore is atomic per file, but you could lose the new refresh_token of one of the two. Not a security issue (just a UX one — user re-runs login), and out of scope for this PR, but worth a tracking comment / future flock.

nit — server.address() as AddressInfo defensive-cast (loopback.ts:72)

server.address() can return string | AddressInfo | null per the type defs. After a successful listen() on an IP/port it'll be AddressInfo, so the cast holds in practice — but a defensive if (!address || typeof address === "string") throw … would surface a programmer error rather than a Cannot read property 'port' of null if listen semantics ever drift.
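
A possible shape for that guard is below. `loopbackPort` is a hypothetical helper name; the union type matches what Node's `server.address()` is declared to return.

```typescript
import type { AddressInfo } from "node:net";

/** Narrow the result of server.address() instead of casting it. */
function loopbackPort(address: string | AddressInfo | null): number {
  if (address === null || typeof address === "string") {
    // null: server not listening; string: pipe or Unix socket. Neither has a port.
    throw new Error("loopback server is not listening on a TCP port");
  }
  return address.port;
}
```

A call site would then read `loopbackPort(server.address())` after `listen()` has completed, and a programmer error surfaces with a clear message rather than a property access on null.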

nit — revokeTokens swallows abort/timeout silently (oauth.ts:248)

The catch {} makes diagnosis of "logout completed but server still thinks I'm logged in" hard. Logout's a rare interactive command — even a c.dim() one-liner ("Could not contact revoke endpoint; local session cleared.") would be helpful. The !res.ok branch already does this; the network-error branch should match.

nit — refresh command exits 1 only on REFRESH_FAILED; other AuthErrors throw (commands/auth/refresh.ts:62-66)

API_ERROR (5xx) bubbles past the isAuthError branch and crashes with a stack trace via the default citty handler. Consider treating any isAuthError(err) as a friendly-exit case here.

nit — auth login no longer rolls back OAuth state on identity-check failure (commands/auth/login.ts:65-78)

The api-key path snapshots + rolls back on 401. The OAuth path persists tokens and then warns "transient" on any verify error including 401. A 401 from /v3/users/me on tokens we just minted is more likely "the IdP issued, the API doesn't yet recognize" — your current "don't roll back, this is transient" is defensible — but you might want a 401-specific branch that surfaces "Signed in but identity check returned 401; run auth status in a moment" rather than the generic warn.


what's right

  • PKCE: randomBytes(64) → 86-char URL-safe verifier, base64url(SHA256(verifier)) challenge, S256 method.
  • State: 256-bit entropy, timingSafeEqual comparison.
  • Loopback: 127.0.0.1 not localhost, ephemeral port, /oauth/callback exact-path match (other paths → 404 without leaking the callback), method allow-list (GET only), cache-control: no-store.
  • Error-page HTML escapes IdP-supplied error/error_description (XSS).
  • redirectUri captured once and reused — no reconstruction from req.socket.localAddress.
  • code_verifier never logged; Opening browser to <host+path> strips the query string deliberately.
  • 0600 file / 0700 dir; atomic temp+rename writes.
  • parseTokenResponse validates header-safety on tokens before they touch a future request.
  • persistOAuth discipline (preserveMissing on refresh, not on fresh login) is correctly reasoned out — landed me on the exact RFC 6749 §6 rotation scenario.
  • Logout revokes refresh_token first then access_token, AbortController with 5s, never throws.

Approve once the keep-alive close fix and the sk_ scrub gap are in.

— Vai

@jrusso1020 jrusso1020 changed the base branch from 05-26-feat_cli_add_hyperframes_auth_login_--api-key_status_logout to graphite-base/1084 May 28, 2026 05:48
@jrusso1020 jrusso1020 force-pushed the 05-26-feat_cli_add_hyperframes_auth_oauth_pkce_loopback_refresh_ branch from 83d84f6 to 81aff68 Compare May 28, 2026 05:59
@jrusso1020
Collaborator Author

Thanks @vanceingalls — all three blockers fixed. Force-pushed (signed/verified).

1. Keep-alive hang on server.close() — fixed two ways:

  • respond() now emits Connection: close so the browser doesn't open a keep-alive socket in the first place.
  • close() calls server.closeAllConnections?.() (Node ≥18.2) before server.close() to terminate any sockets that DID keep-alive.

Verified live: after successful consent, the CLI process exits promptly instead of waiting on the browser's idle timeout.

2. scrubCredentials missed sk_V2_… — HEYGEN_KEY is now /\b(hg|sk)_[A-Za-z0-9_-]{4,}/g, replacement preserves the prefix ($1_<redacted>). Added a regression in scrub.test.ts that asserts a bare inline sk_V2_hgu_… echo is redacted (no header anchor).

3. 401-retry used stale refresh_token — hook contract changed from (rt) => Promise<string> (access_token only) to (rt) => Promise<OAuthTokens> (full new token set). fetchUser builds the retry credential with refreshed.refresh_token ?? credential.refresh_token, so a rotated RT is carried into the next call's credential instead of resending the now-invalidated one. Type system enforces the new shape. Added a regression that asserts the hook returns the full token set and the type check prevents the old Promise<string> shape from compiling.
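
The revised hook contract in item 3 can be illustrated with a small stand-alone sketch. It is not the AuthClient code: `sendWithRefresh`, `Send`, and `Credential` are hypothetical names, and the HTTP call is injected so the example runs without a network.

```typescript
interface OAuthTokens {
  access_token: string;
  refresh_token?: string; // omitted when the server does not rotate
}

interface Credential {
  access_token: string;
  refresh_token: string;
}

/** Hook returns the full token set, not just the access token. */
type RefreshHook = (refreshToken: string) => Promise<OAuthTokens>;
/** Injected request function; returns only what the sketch needs. */
type Send = (accessToken: string) => Promise<{ status: number }>;

async function sendWithRefresh(
  send: Send,
  credential: Credential,
  refresh: RefreshHook,
): Promise<{ status: number; credential: Credential }> {
  const first = await send(credential.access_token);
  if (first.status !== 401) return { status: first.status, credential };

  let tokens: OAuthTokens;
  try {
    tokens = await refresh(credential.refresh_token);
  } catch {
    return { status: 401, credential }; // refresh failed: surface the original 401
  }

  // Carry a rotated refresh token forward; fall back to the old one otherwise.
  const next: Credential = {
    access_token: tokens.access_token,
    refresh_token: tokens.refresh_token ?? credential.refresh_token,
  };
  const second = await send(next.access_token); // single retry, never retried again
  return { status: second.status, credential: next };
}
```

Returning the updated credential alongside the status is one way to make sure later calls in the same process do not resend an invalidated refresh token.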

Note on the rebase: #1081 merged earlier today, so this branch now rebases onto 8a9291c4 (the merged #1081 commit reachable from main via the merge). The standalone scrubber sk_V2_… fix from your #1081 comment lands implicitly when this PR merges, since #1084 deletes the inline scrubCredentials in client.ts in favor of the shared scrub.ts module that now has the broader pattern. There's a small transient window between the #1081 merge and this PR merging where the inline scrubber on main still has the hg_-only pattern; happy to file a tiny standalone hotfix for that if you'd like, otherwise it self-resolves on #1084 merge.

Skipped the smaller nits this round (revokeTokens log, refresh command's API_ERROR handling, MIN_EXPIRES_IN_SECONDS coupling) to keep this PR focused on the blockers — happy to do them as follow-ups if you want.

107 tests green, lint/format/typecheck/fallow clean.

@jrusso1020 jrusso1020 changed the base branch from graphite-base/1084 to main May 28, 2026 06:05
@jrusso1020 jrusso1020 dismissed miguel-heygen’s stale review May 28, 2026 06:05

The base branch was changed.

Collaborator

@vanceingalls vanceingalls left a comment

Re-review on 81aff683

All three prior blockers are addressed cleanly.

Keep-alive hang fixed (loopback.ts:86, loopback.ts:178-189). Belt-and-suspenders — Connection: close on the response header so the browser doesn't even try to keep-alive, plus server.closeAllConnections?.() before server.close() to terminate any sockets that did slip through. The inline comment captures the failure mode (Chrome ~5min idle).

sk_V2_… scrub gap closed (scrub.ts:16). HEYGEN_KEY is now \b(hg|sk)_[A-Za-z0-9_-]{4,} so both prefixes redact when echoed inline (not just behind a header anchor). scrub.test.ts:11-20 locks in the exact threat model (sk_V2_hgu_… in a free-text error body).

Stale-RT-on-retry fixed at the type level (client.ts:89, client.ts:140-144). The refresh hook contract changed from Promise<string> (just access_token) to Promise<OAuthTokens>, and the retry rebuilds the credential with the new refresh_token when one is returned. Compile-time enforcement means future callers can't reintroduce the bug. The JSDoc on onUnauthenticatedRefresh explains why ("any subsequent refresh on the same in-memory credential doesn't re-use a now-invalidated rotated RT"), and client.test.ts:197-228 covers the type-shape regression.

CI

All required checks green on 81aff683: Build, Test, Typecheck, Lint, Format, CLI smoke (required), Smoke: global install, Studio: load smoke, Test: runtime contract, Preview parity. Windows shards still pending but not required for this PR.

Carried-forward nits (non-blocking, won't gate)

These all carried over from the prior review and remain optional:

  • MIN_EXPIRES_IN_SECONDS=30 is still less than EXPIRY_SKEW_MS=60_000 (oauth.ts:51). Cosmetic today; relevant if a proactive-refresh path is added.
  • persistOAuth is still read-modify-write without an advisory lock (oauth.ts:400-419). Concurrent auth refresh + auth login could clobber a fresh RT. Miguel called this out too — acceptable as future work.
  • server.address() as AddressInfo cast remains undefended (loopback.ts:72).
  • revokeTokens still swallows abort/network errors silently (oauth.ts:248).
  • auth refresh exits 1 only on REFRESH_FAILED; API_ERROR still bubbles to the default citty handler with a stack trace (commands/auth/refresh.ts:39-46).
  • auth login's reportIdentity still warns "transient" on any verify error, including 401 from the just-minted token. A 401-specific branch would be a touch friendlier.

Approving. Ship it when Graphite mergeability and the optional Windows checks complete.

— Vai

Collaborator Author

Merge activity

  • May 28, 6:12 AM UTC: Graphite couldn't merge this PR because it failed for an unknown reason (unsigned commits detected).

@jrusso1020 jrusso1020 merged commit b7b8558 into main May 28, 2026
31 checks passed
@jrusso1020 jrusso1020 deleted the 05-26-feat_cli_add_hyperframes_auth_oauth_pkce_loopback_refresh_ branch May 28, 2026 06:13

3 participants